Ultrasound Elastography in Inflammatory Bowel Diseases: A Systematic Review of Accuracy Compared with Histopathological Assessment

Abstract
Background and Aims: Ultrasound elastography [USE] is an innovative, non-invasive, promptly available, ancillary technique that has been proposed for the evaluation of intestinal fibrosis as a monitorable biomarker, in terms of stiffness. The non-invasive estimate of fibrosis by USE is appealing for dedicated physicians, as it may help to optimise the treatment of inflammatory bowel disease [IBD] patients [surgical vs non-surgical]. We aimed to systematically review the literature evidence on ultrasound elastography in IBD patients.
Methods: For this qualitative systematic review, we searched PubMed, EMBASE, and Scopus to identify all studies, published until October 2021, investigating the application of USE in IBD patients compared with histopathological assessment.
Results: Overall, 12 papers published between 2011 and 2019 were included. A total of 275 IBD patients were included: 272 Crohn’s disease [CD] [98.9%] and three ulcerative colitis [UC] [1.1%]. Seven [58.3%] and four [33.3%] studies investigated strain elastography [SE] and shear wave elastography [SWE], respectively; in one study [8.3%] both techniques were addressed. The histological evaluation was largely conducted on surgical specimens, and in two studies endoscopic biopsies were also included. The histological assessment was semi-quantitative in all the included studies, except for two where the fibrosis was evaluated only qualitatively. In 10/12 publications USE could accurately distinguish inflammation from fibrosis in the examined bowel tracts.
Conclusions: From the preliminary available data, an overall moderate-to-good accuracy of USE in detecting histological fibrosis [10/12 studies] was found. Point shear wave elastography was reported to perform better than the other techniques. Further studies are needed to confirm this evidence.


Introduction
Chronic inflammatory bowel diseases [IBDs] are relapsing-remitting and progressive conditions that lead to irreversible bowel damage. 1,2 The stricturing phenotype of Crohn's disease [CD] and the late stages of ulcerative colitis [UC], in particular, are characterised by the development of fibrosis in the affected bowel tract. 3,4 Fibrotic strictures have a multifactorial biological basis that involves the activation of mesenchymal cells that overproduce and deposit extracellular matrix. 3 Soluble molecules such as cytokines and growth factors [e.g., transforming growth factors, tumour necrosis factor, interleukins] trigger this activation, with a subsequent remodelling of the tissue by matrix metalloproteinases and other fibrogenic enzymes. 4,5 Predominantly fibrotic disease is believed to be less responsive to medical treatments and often requires a surgical intervention [e.g., resection, strictureplasty]. 6,7 For this reason, distinguishing between IBD patients with a primarily inflammatory or a primarily fibrotic disease has a relevant impact on clinical management.
In a recent systematic review, the sensitivity of cross-sectional imaging techniques in detecting fibrosis has been assessed at around 80% for both computed tomography [CT] and magnetic resonance imaging [MRI]. 8 The main features that help radiologists in distinguishing between inflammation and fibrosis are bowel wall thickness, mural contrast enhancement, mesenteric vascularity, and mesenteric fat stranding. [9][10][11] To date, neither a scoring system nor standardised criteria have been established for differentiating fibrosis at cross-sectional imaging, which thus remains an unsolved challenge for dedicated physicians.
Ultrasound elastography [USE] is an innovative, non-invasive, promptly available, ancillary technique that has been proposed in the evaluation of intestinal fibrosis as a monitorable imaging biomarker, in terms of stiffness. 8 As concerns technical aspects, USE assesses the elastic properties of soft tissues by acoustic or mechanical stimulation: the tissue response to the stress is processed and codified as an image with a scale of colours or as a quantitative measurement corresponding to the estimated stiffness value. The main types of USE are shear wave elastography [SWE] and strain elastography [SE]. The stimulus for the measured stress ranges from acoustic radiation force impulse imaging [ARFI] to mechanical or physiological palpation. In detail, point-SWE [pSWE] estimates a quantitative value of a specific point of the examined tissue, whereas two-dimensional SWE [2D-SWE] codifies a colour map that reflects the stiffness of a wider portion of the examined tissue. The application of USE has already been incorporated in the diagnostic algorithms of diseases of the liver, breast, pancreas, and thyroid, especially for neoplastic lesions. 12,13 Thanks to recent technological advancements, USE is implemented and usable in real time. However, there are no current international guidelines instructing on the applications of elastography in the field of IBD.
The role of USE in the management of IBD is currently under investigation, and its validation requires precise knowledge of the corresponding histological features. So far, data from literature on USE accuracy in the field of IBD are mostly derived from small cohorts and have never been comprehensively reviewed specifically and exclusively in comparison with histology as a reference standard. The purpose of our systematic review is to provide an exhaustive overview of the available data on USE in IBD patients.

Technique and principles of ultrasound elastography
Elastography evaluates the tissue elasticity, defined as the tendency of that tissue to resist deformations by an applied force, or to return to its original shape once the force is removed. Biologically, a stiff region displays less deformation compared with healthy surrounding tissue when the same stress stimulus is applied. The technologies currently used and commercially available in US machines are divided into two main types: strain [SE] and shear wave elastography [SWE] [Figure 1]. These types of elastography differ in the process used to measure tissue deformation in response to an applied force.
In detail, the applied force can be a mechanical pressure, either internal [exploiting the physiological periodic compression induced by circulatory and/or respiratory motion] or external [generated by hand through the US transducer, which is gently pressed against the explored tissues] [Figure 1]. Alternatively, tissues can be stressed by imposing a low-frequency ARFI stimulus generated by the US device itself. In SE, the induced tissue displacement is traced between pairs of echo frames and then the strain is calculated from the spatial gradient of that displacement. Through the use of a colour map, the different strains are encoded within a two-dimensional image that can be instantly visualised together with the conventional B-mode US image. SE is a semi-quantitative technique that cannot measure the elasticity of the examined tissue as an absolute value, since the absolute value of the applied stress is unknown.
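As a minimal illustration of the strain calculation described above, the following sketch derives strain as the spatial gradient of a displacement profile along depth and then forms a strain ratio between two regions. The depth profile, the numerical values, the function names, and the choice of softer region over stiffer region for the ratio are all assumptions made for illustration; they are not taken from any of the included studies or from a specific US device.

```python
import numpy as np


def axial_strain(displacement_mm, depth_mm):
    """Strain estimated as the spatial gradient of axial displacement."""
    return np.gradient(displacement_mm, depth_mm)


def strain_ratio(strain, reference_mask, target_mask):
    """Mean absolute strain in a reference region divided by that in a target region."""
    return float(np.abs(strain[reference_mask]).mean() / np.abs(strain[target_mask]).mean())


# Synthetic depth profile (hypothetical numbers): a softer layer from 0 to 10 mm
# deforms by 2%, a stiffer layer from 10 to 20 mm deforms by 0.5%.
depth_mm = np.linspace(0.0, 20.0, 201)
displacement_mm = np.where(
    depth_mm < 10.0,
    0.02 * depth_mm,
    0.2 + 0.005 * (depth_mm - 10.0),
)

strain = axial_strain(displacement_mm, depth_mm)

# Masks stay clear of the 10 mm interface, where the finite-difference
# gradient blends the two layers.
soft_region = depth_mm < 9.0
stiff_region = depth_mm > 11.0

ratio = strain_ratio(strain, soft_region, stiff_region)
print(f"strain ratio (soft / stiff): {ratio:.2f}")
```

The ratio is dimensionless and depends only on the relative deformation of the two regions, which is consistent with SE being semi-quantitative: without knowing the applied stress, the strain values cannot be converted into an absolute elasticity.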
In SWE, the dynamic stress induces shear waves that propagate perpendicular to the US beam. The speed of the generated shear waves is measured and returns quantitative estimates of the tissue elasticity. Technically, two SWE methods can be distinguished: point-SWE [pSWE] and two-dimensional SWE [2D-SWE]. In pSWE, the speed of the shear wave is measured in a single specific location, the region of interest [ROI]; 2D-SWE produces a quantified colour map of the distribution of shear wave velocities in a wider region.
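The link between shear wave speed and elasticity can be sketched with the standard textbook relations for a linear, isotropic, incompressible medium: shear modulus μ = ρc² and Young's modulus E ≈ 3ρc². These modelling assumptions, the tissue density of 1000 kg/m³, the simple time-of-flight speed estimate, and the function names are illustrative choices, not values or methods reported by the included studies.

```python
def shear_wave_speed_m_s(lateral_distance_mm, arrival_delay_ms):
    """Time-of-flight estimate: distance travelled divided by arrival delay.

    Millimetres per millisecond is numerically equal to metres per second.
    """
    return lateral_distance_mm / arrival_delay_ms


def shear_modulus_kpa(speed_m_s, density_kg_m3=1000.0):
    """Shear modulus mu = rho * c^2, converted from Pa to kPa."""
    return density_kg_m3 * speed_m_s ** 2 / 1000.0


def youngs_modulus_kpa(speed_m_s, density_kg_m3=1000.0):
    """Young's modulus E = 3 * mu, valid only for incompressible tissue."""
    return 3.0 * shear_modulus_kpa(speed_m_s, density_kg_m3)


# Hypothetical measurement: the wave front reaches a point 4 mm away after 2 ms.
speed = shear_wave_speed_m_s(lateral_distance_mm=4.0, arrival_delay_ms=2.0)
mu = shear_modulus_kpa(speed)
e = youngs_modulus_kpa(speed)
print(f"speed = {speed} m/s, shear modulus = {mu} kPa, Young's modulus = {e} kPa")
```

Because stiffness scales with the square of the speed, a modest difference in measured shear wave velocity corresponds to a much larger difference in estimated modulus, which is one reason why results reported in m/s and in kPa should not be compared directly.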

Methods
This work was conducted in accordance with the Cochrane Handbook 14 and Preferred Reporting Items for Systematic Reviews and Meta-Analyses [PRISMA] recommendations for reporting systematic reviews. 15

Data sources and search strategy
We designed a comprehensive search strategy and searched PubMed/MEDLINE, Embase, and Scopus up to October 2021 to identify eligible studies. A hand-search of abstracts from the annual meetings of Digestive Disease Week, the American College of Gastroenterology, the European Crohn's and Colitis Organisation, and the United European Gastroenterology Week, up to 2021, was also performed.
The search query employed both an exhaustive list of keywords and index terminology whenever possible.

Selection process, data extraction, and quality assessment
Two review authors [FF, ADB] independently screened the titles and abstracts yielded by the search. Full reports were obtained for all titles that appeared to meet the inclusion criteria or where there was any uncertainty. Disagreements were resolved through collegial discussion. The reasons for excluding trials were recorded. When there were multiple articles for a single study, the latest publication was used. The studies were reviewed for patients' selection and features, technical aspects, USE, and histological assessment. When the USE assessment was done through classes based on the analysis of qualitative colour maps, it was considered semi-quantitative; when the USE measurements were reported as absolute values, it was considered a quantitative assessment. Finally, when the USE measurements were not ordered into classes of severity, it was considered as a qualitative assessment. The quality of the included studies was assessed with the Quality Assessment of Diagnostic Accuracy Studies [QUADAS-2] checklist. 16 This tool includes four domains: patient selection, index test, reference standard, and flow and timing. The risk of bias is evaluated across all four domains, and the first three domains are also assessed in terms of concerns regarding applicability. The QUADAS-2 allows expression of an overall judgment as 'low risk of bias' or 'low concern regarding applicability' in case of assignment of 'low' to most/all domains relating to bias or applicability. If a study is judged 'high' or 'unclear' regarding one or more domains, then it may be judged 'at risk of bias' or as having 'concerns regarding applicability'.
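The overall QUADAS-2 judgement described above can be sketched as a simple rule. The sketch adopts the stricter reading of the text, in which any domain rated 'high' or 'unclear' prevents an overall 'low' judgement; the domain keys, the function name, and the example ratings are hypothetical and do not correspond to any of the included studies.

```python
BIAS_DOMAINS = (
    "patient_selection",
    "index_test",
    "reference_standard",
    "flow_and_timing",
)
# Applicability is assessed only for the first three domains.
APPLICABILITY_DOMAINS = BIAS_DOMAINS[:3]


def overall_judgement(ratings, domains, low_label, concern_label):
    """Return low_label only if every listed domain is rated 'low'."""
    if all(ratings[domain] == "low" for domain in domains):
        return low_label
    return concern_label


# Hypothetical ratings for a single study.
bias_ratings = {
    "patient_selection": "high",
    "index_test": "low",
    "reference_standard": "low",
    "flow_and_timing": "unclear",
}
applicability_ratings = {
    "patient_selection": "low",
    "index_test": "low",
    "reference_standard": "low",
}

bias = overall_judgement(
    bias_ratings, BIAS_DOMAINS, "low risk of bias", "at risk of bias"
)
applicability = overall_judgement(
    applicability_ratings,
    APPLICABILITY_DOMAINS,
    "low concern regarding applicability",
    "concerns regarding applicability",
)
print(bias)
print(applicability)
```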

Accuracy of elastography in detecting fibrosis
Overall, an accurate differentiation of inflammatory from prevalently fibrotic intestinal tracts, compared with histology, was found in all the included papers, 17,[19][20][21][22][24][25][26][27][28] except for the ones by Havre et al. and Serra et al. 18,23 These authors concluded in their studies that USE could not accurately distinguish the grade of inflammation from fibrosis. 18,23 As concerns USE accuracy in detecting fibrosis, point-SWE was found to perform better compared with SE and ARFI by Ding et al. 26 Taking together the assessments of USE in all the included studies, their accuracy varied from 35 to 91%. 17 In detail, Baumgart et al. observed significantly higher mean strain ratios in unaffected compared with affected intestinal tracts [mean ± standard deviation, 77.1 ± 21.4 vs 13.3 ± 11.2, p < 0.001]. 20 In this study, the affected tracts displayed increased collagen deposition, also significantly associated with USE assessments [p < 0.001]. 20 Fraquelli et al.,

Discussion
This systematic review illustrates the present understanding of the capability of USE in detecting and quantifying, whenever possible, the degree of fibrosis within the bowel wall of IBD patients. Collectively, the analysis of the published literature indicates an overall moderate-to-good accuracy of USE in detecting histological fibrosis [10/12 studies]. 17,[19][20][21][22][24][25][26][27][28] In detail, the accuracy of USE varied from 35% to 91% across the included studies. 17 However, important concerns are raised regarding the heterogeneity of the USE modalities investigated [SE, SWE], in terms of both the input application/stimulus and the biomarkers analysed [e.g., strain ratio, pSWE], which does not allow formulation of unequivocal accuracy data. In particular, since SE only allows semi-quantitative assessments of stiffness, these are difficult to compare longitudinally. With respect to technical aspects, all the included studies had a limited cohort of patients [from 1 to 105 patients] and the US devices used were from different manufacturers, the Philips iU22 being the most used 21,23,25 [Table 1]. These methodological gaps might only be overcome by multicentre studies adopting common USE equipment.
A further explanation of the variation of USE accuracy observed in our systematic review is the heterogeneity of the bowel segments analysed in the included studies. Indeed, the studies investigating exclusively the ileum in CD patients, where the pathological processes involve the whole bowel wall, reported higher rates of accuracy and better correlation between USE measurements and histology. 17,21,22 Possible selection bias must be additionally addressed: the investigation of USE in advanced stenosis candidates for surgical resection might have returned higher rates of tissue fibrosis, possibly enhancing the accuracy of USE. The incorporation of a control group and the inclusion of different stages of disease in the study design are necessary to reduce this kind of bias.
The main strength of our systematic review was to address the accuracy of USE in detecting fibrosis exclusively in comparison with histological assessment. To our knowledge, this inclusion criterion was never adopted by previous systematic reviews on the topic of USE in IBD. [29][30][31][32] Of note, the definition of fibrosis was univocally adopted by all the studies, but the histological assessment varied from semi-quantitative [17][18][19][20][21][22][23][24][25][26] to merely qualitative, 27,28 thus limiting the uniformity of the data and a direct comparison between the studies. A further main finding emerges from our analysis, which is the lack of a standardised histological score to quantify fibrosis. Indeed, a strong discrepancy with respect to the reference standard adopted [i.e., also endoscopic biopsies, which limits the proper assessment of the submucosal fibrotic changes] and the histological quantification of fibrosis was found between the included studies.
It appears clear that USE has been more extensively investigated in CD than in UC [only one study included, three UC patients], but with the growing adoption of US also in the monitoring of UC this trend will reasonably change in the near future.
Interestingly, several publications suggested the integration of USE with conventional B-mode US and CEUS in order to gain greater accuracy. [21][22][23][24][25]27,28 Indeed, this concept has been broadly explored and is well known by experienced bowel sonographers who are used to combining different qualitative and quantitative features within activity scores. 33,34 The matter of operator dependency remains for USE, as for all ultrasonographic methods. When estimated, there was a moderate-to-high inter-reader agreement in SE measurements, 21,25 and we can speculate that inter-reader agreement might be superior in the case of SWE. A consensus on specific skills and training for USE operators is yet to be specifically established.
Another relevant issue regards the quality assessment of the included studies [Table 2]. 16 Indeed, although the QUADAS-2 checklist addresses the main methodological features of the studies, many limitations could not be captured [e.g., the small sample size and the lack of validation and reproducibility].
The main limitation of this work is that no meta-analysis was performed due to the lack of standardisation between the results of the included studies; further limitations derive from
In our view, the data gathered so far on USE do not yet deserve endorsement for incorporation into the management algorithms of IBD, since USE does not appear to add any specific information to guide clinical decisions. Indeed, current European Federation of Societies for Ultrasound in Medicine and Biology guidelines instruct on USE with a relatively low level of evidence and suggest using it to characterise bowel wall lesions exclusively in CD. 35 The appeal of USE lies in its non-invasiveness and repeatability. Indeed, in the treat-to-target era a new physiological surrogate endpoint, such as the quantification of intestinal fibrosis, would be warmly welcomed by dedicated physicians.
Our systematic review indicates that elastography cannot yet replace the tissue specimen, at least in the field of bowel ultrasound and IBD. The applicability of this technique to the bowel wall, compared with parenchymal organs, might be limited and challenged by the unique features of the intestine [i.e., peristalsis, the peritoneum, the layered structure].
In conclusion, despite the data gathered so far, the role of USE in the detection and quantification of fibrosis in IBD patients requires additional research with properly designed randomised clinical trials. Moreover, long-term data on patients followed up longitudinally with USE are warranted as well.
The data underlying this article are available in its online Supplementary material and upon request to the corresponding author.

Funding
This paper was not funded.